🐿️ Scour
🗣️ CMU Pronouncing

Phonetic Dictionaries, Speech Synthesis, Linguistic Resources, Audio Processing

Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation
arxiv.org·9h
🎙️Whisper
VibeVoice (1.5B) - TTS model by Microsoft
huggingface.co·20h·
Discuss: Hacker News, r/LocalLLaMA
🎙️Whisper
Google NotebookLM goes global with multilingual AI video summaries of your notes
techradar.com·10h
🏛Digital humanities
Google’s URL Context Grounding: Another Nail in RAG’s Coffin?
towardsdatascience.com·34m
🌀Brotli Internals
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
arxiv.org·9h
🎙️Whisper
TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
arxiv.org·9h
🎙️Whisper
A Universal Rhythm Guides How We Speak: Global Analysis Reveals 1.6-second Units
science.slashdot.org·1d
🎵Music Universality
Using Real Survey Data to Create Authentic AI Personas for Extended Research
askrally.com·51m·
Discuss: Hacker News
🏛Digital humanities
Speech-Based Depressive Mood Detection in the Presence of Multiple Sclerosis: A Cross-Corpus and Cross-Lingual Study
arxiv.org·9h
🎙️Whisper
LingVarBench: Benchmarking LLM for Automated Named Entity Recognition in Structured Synthetic Spoken Transcriptions
arxiv.org·1d
⚙️Compression Benchmarking
EyeMulator: Improving Code Language Models by Mimicking Human Visual Attention
arxiv.org·9h
📊Feed Optimization
Using Gemini prompts for Suno's Cover/Remix helps unblock creative projects
backpocketmusic.com·13h·
Discuss: Hacker News
🎧Learned Audio
DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections
arxiv.org·1d
📇Dublin Core
Paradigms of Intelligence Team
github.com·21m·
Discuss: Hacker News
🔲Cellular Automata
Constrained Prompt Enhancement for Improving Zero-Shot Generalization of Vision-Language Models
arxiv.org·9h
🧠Neural Codecs
Dissonance: A journey through musical possibility space
aatishb.com·2d·
Discuss: Hacker News
🌈Spectral Audio
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
arxiv.org·9h
🧮Kolmogorov Bounds
vLLM on x86: Because Not Everyone Can Afford a GPU Cluster
dev.to·3h·
Discuss: DEV
⚡Homebrew CPUs
Learning JavaScript Promises the Feynman Way (With AI Assistance)
jakeworth.com·25m·
Discuss: Hacker News
⚔️Lean Tactics
Mini-Omni-Reasoner: Token-Level Thinking-in-Speaking in Large Speech Models
arxiv.org·1d
🎙️Whisper